# XLSR-53 fine-tuning

Wav2vec2 Large Xlsr Deepfake Audio Classification

An audio classification model based on the wav2vec2 architecture, fine-tuned for deepfake audio detection tasks, excelling in gender recognition and fake audio detection.

Audio Classification

Wav2vec2 Large Xlsr 53 Amharic

An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on an Amharic speech corpus.

Speech Recognition

Transformers Other

Exp W2v2t It Xlsr 53 S387

An Italian automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice 7.0 Italian dataset.

Speech Recognition

Transformers Other

Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V7

An automatic speech recognition model based on wav2vec2-large-xlsr-53, fine-tuned on the GARY109/AI_LIGHT_DANCE dataset and optimized for StepMania game audio.

Speech Recognition

Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V4 2

An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE dataset.

Speech Recognition

Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V3

An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53, specializing in singing voice recognition.

Speech Recognition

Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V6

An automatic speech recognition (ASR) model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE - ONSET-STEPMANIA2 dataset.

Speech Recognition

Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 5gram V4 1

This model is an automatic speech recognition (ASR) model based on the wav2vec2-large-xlsr-53 architecture, fine-tuned on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING2 dataset, primarily used for singing voice recognition tasks.

Speech Recognition

Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V3

An automatic speech recognition model based on wav2vec2-large-xlsr-53, fine-tuned on the GARY109/AI_LIGHT_DANCE dataset.

Speech Recognition

Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53 V1

An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE - ONSET-SINGING2 dataset, primarily used for singing voice recognition tasks.

Speech Recognition

Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V2

An automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE dataset.

Speech Recognition

Ai Light Dance Stepmania Ft Wav2vec2 Large Xlsr 53 V1

This model is an automatic speech recognition model fine-tuned from wav2vec2-large-xlsr-53 on the GARY109/AI_LIGHT_DANCE - ONSET-STEPMANIA2 dataset.

Speech Recognition

Ai Light Dance Singing2 Ft Wav2vec2 Large Xlsr 53

An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the AI Light Dance dataset.

Speech Recognition

Ai Light Dance Chord Ft Wav2vec2 Large Xlsr 53

An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the GARY109/AI_Light_Dance - ONSET-CHORD2 dataset.

Speech Recognition

Ai Light Dance Singing Ft Wav2vec2 Large Xlsr 53

An automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the AI_LIGHT_DANCE - ONSET-SINGING dataset, primarily used for singing voice recognition tasks.

Speech Recognition

Wav2vec2 Large Multilang Cv Ru

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the common_voice dataset, primarily designed for Russian speech recognition tasks.

Speech Recognition

Wav2vec2 Large Xlsr 53 Tr Fine Tuning Deprecated

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Turkish dataset.

Speech Recognition

Wav2vec2 Large Xlsr 53 842h Luxembourgish 14h

A large wav2vec2.0 model fine-tuned with 842 hours of unlabeled and 14 hours of labeled Luxembourgish speech data, supporting Luxembourgish speech recognition

Speech Recognition

Transformers Other

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on an unspecified dataset, supporting recognition of Arabic dialects (Arabizi).

Speech Recognition

A German speech recognition model fine-tuned from jonatasgrosman/wav2vec2-large-xlsr-53-german.

Speech Recognition

Wav2vec2 Common Voice Lithuanian

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - LT dataset for Lithuanian speech recognition.

Speech Recognition

Transformers Other

A Portuguese automatic speech recognition model fine-tuned from jonatasgrosman/wav2vec2-large-xlsr-53-portuguese.

Speech Recognition

Wav2vec2 Common Voice Ab Demo

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE - AB dataset.

Speech Recognition

Transformers Other

patrickvonplaten

Wav2vec2 Common Voice Tr Demo

An automatic speech recognition (ASR) model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the COMMON_VOICE SV-SE dataset, supporting Swedish speech recognition.

Speech Recognition

Wav2vec2 Large Xlsr 53 W2V2 TATAR SMALL

A Tatar automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice 8 dataset, with a test-set WER of 53.16%.

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 German

A German speech recognition model fine-tuned from Facebook's wav2vec2-large-xlsr-53 (XLSR-53 large) on the Common Voice 6.1 German dataset.

Speech Recognition German

Wav2vec2 Large Xlsr 53 Finnish

A Finnish speech recognition model fine-tuned from the XLSR-53 large model; it supports 16 kHz audio input.

Speech Recognition Other

Wav2vec2 Xlsr Khmer

A Khmer speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, achieving a WER of 24.96% on the OpenSLR Khmer dataset.

Speech Recognition Other

Wav2vec2 Large Xlsr 53 French

A French speech recognition model fine-tuned from the XLSR-53 large model on the Common Voice dataset for French speech-to-text conversion.

Speech Recognition French

Wav2vec2 Large Xlsr 53 Euskera

A Basque (Euskera) speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice dataset.

Speech Recognition Other

Wav2vec2 Xlsr 53 Rm Vallader With Lm

A speech recognition model for the Vallader dialect of Romansh, based on wav2vec2-xlsr-53 with language model support.

Speech Recognition

Fb Vindata Vi Large

A Vietnamese automatic speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the PHONGDTD/VINDATAVLSP - NA dataset.

Speech Recognition

Wav2vec2 Large Xlsr Kn

An automatic speech recognition (ASR) model for Kannada, fine-tuned from Facebook's wav2vec2-large-xlsr-53 on the OpenSLR SLR79 dataset.

Speech Recognition Other

Wav2vec2 Large Xlsr 53 Estonian

An Estonian speech recognition model fine-tuned from Facebook's XLSR-53 large model, achieving a 30.74% word error rate on the Common Voice dataset.

Speech Recognition Other

Wav2vec2 Xlsr 53 Tamil

A Tamil speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, trained on the Common Voice Tamil dataset.

Speech Recognition Other

Wav2vec2 Large Xlsr 53 Spanish

A Spanish speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53, trained on the Common Voice 6.1 Spanish dataset

Speech Recognition Spanish

Wav2vec2 Large Xlsr 53 Greek

A Greek speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 (XLSR-53 large) on the Common Voice 6.1 and CSS10 datasets.

Speech Recognition Other

Wav2vec2 Large Xlsr 53 Dutch

A Dutch speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice and CSS10 datasets; it supports 16 kHz audio input.

Speech Recognition Other

Wav2vec2 Large Xlsr 53 Sah CV8

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 on the Common Voice Yakut dataset.

Speech Recognition

Transformers Other

Wav2vec2 Large Xlsr 53 Sakha

A Yakut speech recognition model fine-tuned from the XLSR-53 large model, with a 32.23% word error rate.

Speech Recognition Other
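
Several entries above quote a word error rate (WER), for example 53.16% for the Tatar model and 24.96% for the Khmer model. As a rough illustration of what that figure measures, the sketch below computes WER as the word-level edit distance between a reference transcript and a model transcript, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are hypothetical, and the evaluation scripts behind the listed scores may normalize text (casing, punctuation) differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcript pair: one substituted word out of four.
    print(word_error_rate("the cat sat down", "the cat sat town"))
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference, so it is best read as an error count per reference word and not as a strict percentage of wrong words.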
